1 Introduction


Negotiations of trade agreements are time-consuming and do not always reach an
agreement. Understanding the determinants of successfully concluded agreements can
help to identify drivers and potential pitfalls for future trade agreements. The
formation of trade agreements may be difficult due to a lack of trust and communication
difficulties arising from ethnic or cultural differences between potential members.¹
Differences in cultural norms and expectations about the behavior of the other party can
lead to misunderstandings and negatively affect negotiations.² Individuals from different
cultural backgrounds differ in their level of trust, in how they act when confronting
social dilemmas such as the prisoner’s dilemma or contributions to public goods, and
in their willingness to punish others who free-ride.³ Establishing
trust, escaping the prisoner’s dilemma of strategic trade policy, and dealing with free
riders are well-known key problems of international trade negotiations.⁴ More
specifically, negotiation and bargaining styles differ across countries, and cultural
differences are more pronounced in bargaining settings.⁵

1 Knack and Keefer (1997) find that countries which are ethnically more homogeneous have higher levels
of trust.

2 Zou et al. (2009) show that individuals’ behavior depends on what they perceive to be the consensus or
“common sense” view within their culture; for similar arguments see also Roth et al. (1991). Henrich
(2000) and Henrich et al. (2001) show that behavior in the ultimatum game depends on the culture of the
experiment subjects.

3 Buchan et al. (2002) find that Japanese experiment subjects have a lower level of trust than their
American counterparts. Gächter et al. (2010) and Herrmann et al. (2008) find significant differences in
the willingness to punish non-cooperative players in experiments across different cultural backgrounds. These
are not isolated findings: cross-cultural differences in behavior in trust games are corroborated in a
meta-analysis by Johnson and Mislin (2011).

4 Brander (1986) is probably the first to characterize trade negotiations as an attempt to escape the
prisoner’s dilemma of unilateral strategic trade policy.

Trade negotiations are
particularly affected by cultural differences as they involve infrequent, high stakes 
interactions between often changing high-level politicians or bureaucrats where 
establishing trust and a common understanding may be difficult. Cultural differences 
may also reflect different preferences for policy outcomes in the countries’ populations, 
making it harder for negotiators to reach a consensus and hence successfully conclude 
a trade agreement. 

These cultural differences and associated costs are difficult to measure, particularly at a 
bilateral level between a large set of countries. We propose to use Spolaore and 
Wacziarg’s (2009) genetic distance, a measure of how genetically related populations 
are in terms of their last common ancestor, as a readily available proxy for 
communication and negotiation costs arising from differences in culture and norms as 
a determinant of trade agreements. Anthropological studies have shown that genetic
distance can help to identify common cultural groups, in addition to geographic
distance and shared language, two measures of cultural difference routinely used in the
trade literature.⁶ Similarly, Desmet et al. (2011) find that genetic distance correlates
well with measures of cultural distances based on survey responses.

5 Roth et al. (1991) find that while subjects in different countries exhibit similar behavior in experimental 
markets, individual bargaining behavior varies considerably across countries. Gelfand et al. (2015) find 
that strategies which lead to successful negotiations in the United States are detrimental in Egypt. For a 
literature survey on cultural differences and negotiations, see Gelfand et al. (2012). 

6 For example, cross-cultural differences such as norms around kinship correlate with human genetic
diversity, see Jones (2003). For a general introduction to the relationship between human genetic and
cultural diversity, see Stone and Lurquin (2007).


We use a sample of 176 countries and 45 years and a battery of control variables to 
examine the role of genetic distance in establishing RTAs across countries. Our results 
show that genetic distance has a significant negative and economically meaningful 
influence on the probability of forming an RTA, even after controlling for geographic 
distance, linguistic distance, religious distance, and other control variables used in the 
literature. Contrary to other measures of cultural differences, genetic distance is a 
readily available proxy variable for a large set of country pairs, so it can easily be 
included within the set of standard regressors used to model RTAs. 

It goes without saying that our results should not and cannot be construed to imply
that countries should not engage in trade negotiations with countries which have a larger 
genetic distance, nor do we argue for a biological determinism of trade policy. Instead, 
insofar as genetic distance proxies cultural differences, our results highlight the potential 
usefulness of heightened awareness of possible misunderstandings which may arise during 
trade negotiations due to cultural differences. 

We contribute to the literature which has documented the effect of cultural differences 
proxied by genetic distance on economic outcomes. The seminal contribution is Spolaore 
and Wacziarg (2009) who show that genetic distance between countries can explain 
cross-country differences in income per capita. They focus on a population’s cultural 
distance to the population which represents the current technological frontier, which 
they proxy by the US. The larger these cultural differences, proxied by genetic distance,
the more difficult the diffusion and adaptation of the frontier technology. In our paper,
we focus on genetic distance between any two countries rather than on distance to the US, as trade
negotiations typically only occur between potential member states of a trade agreement, 
and do not typically involve the US. 

Our interpretation of genetic distance as a proxy for unobserved barriers to economic 
integration such as cultural heterogeneity is in line with a broader literature which links 
ethnic diversity measured by genetic distance and cultural heterogeneity.⁷ Desmet et al.
(2011) document that genetic distance improves predictions of the similarity of
individuals’ survey responses compared with using only geographic and linguistic
information. They find that genetic distance predicts the dissolution of deeper
agreements such as the endogenous formation of nation states from culturally diverse 
regions. We find that genetic distance affects economic integration which falls short of 
creating a joint nation state. RTAs can create aggregate welfare gains when signatory 
parties act cooperatively. To establish an RTA and reap its welfare gains, signatories 
must overcome differences in norms and preferences as well as coordinate differences in 
socio-economic policies. 

Guiso et al. (2009) also use genetic distance as a proxy for cultural difference. They show 
that respondents in the Eurobarometer survey place less trust in individuals from
populations at a greater genetic distance. This lower trust at the individual level is
correlated with lower trade and portfolio investment between countries with larger
genetic distance.

7 Ahlerup and Olsson (2012) provide a recent overview of this literature; see also Ashraf and Galor (2013).

Bove and Gokmen (2018) replicate Spolaore and Wacziarg (2009) and show that the impact of
genetic distance on income differences between countries is stable over time. We find 
that genetic distance has a stable and significant impact on RTA formation over more 
than four decades. Davies and Guillin (2014) use genetic distance as a proxy for 
communication barriers and find that US outbound services FDI is correlated between 
countries with low genetic distance. Leblang (2010) does not find a significant effect of 
genetic distance on bilateral FDI and portfolio investment in a single cross-section of 
countries.⁸ Finally, Chaudhry and Ikram (2015) find that long-run GDP growth is
correlated between countries with lower genetic distance. 

Our paper also contributes to the literature on the determinants of RTAs, see, e.g., 
Magee (2003), Baier and Bergstrand (2004), Chen and Joshi (2010), and Egger et al. 
(2011).⁹ None of these papers studies the impact of genetic distance.¹⁰

We also, for the first time, point out the importance of considering the correlation in the 
error structure for correct inference when estimating regressions which seek to identify 
the determinants of RTAs. By construction, unobserved country-specific factors which
determine whether countries sign an RTA are correlated across country pairs. We
propose using two-way clustered standard errors by Cameron et al. (2011) as an easy
solution to this problem. The literature on RTA determinants has neglected this
correlation so far and hence overstates the precision of estimated coefficients.¹¹

8 FDI data are often missing for many country pairs, restricting Leblang’s (2010) analysis to 28
FDI-receiving countries. Our sample comprises more countries and over 40 years.

9 All the cited papers use probit models in their analysis. Besides probit models, a plethora of methods
have been used to analyse the determinants of RTAs: Egger and Larch (2008) use spatial econometric
probit models and Marquez-Ramos et al. (2011) use ordered probit models to explain the drivers of
different levels of trade integration between countries. Kohl and Brouwer (2014) use a clustering algorithm
to identify “natural” trade integration blocs and estimate the impact of determinants of these blocs using
a probit model.

10 The single exception is Martin et al. (2012) who use genetic distance in one specification for a
cross-sectional regression for the year 2000. Using panel data, we can analyse the impact of genetic distance
while controlling for time-varying country-specific unobserved drivers of trade policy.

The remainder of the paper is organized as follows: Section 2 describes our data. Section 
3 describes our empirical strategy and main results. Section 4 discusses several robustness
checks. Section 5 concludes.


2 Data 


RTA and genetic distance 


Our dependent variable is $RTA_{ijt}$, a binary variable which takes the value 1 if there is
a customs union or free trade agreement between two countries, and 0 otherwise.¹² We
use a panel from 1970 to 2014, purely driven by data restrictions.¹³

11 Baier and Bergstrand (2004) discuss correlation of errors across countries within an RTA (e.g., across
EU member countries) but do not consider the more general case of correlation of a given country’s trade
policy across all its potential partner countries which we consider. The potential correlation within an
RTA of Baier and Bergstrand (2004) is modelled on the value of the dependent variable, potentially
introducing endogeneity bias in the calculation of the standard errors. Our approach avoids this.

12 We use Mario Larch’s Regional Trade Agreements Database from Egger and Larch (2008):
http://www.ewf.uni-bayreuth.de/en/research/RTA-data/index.html

13 Variables including GDP and polity factors are not available for many countries before 1970.

14 The remainder of this paragraph is a succinct summary of Spolaore and Wacziarg (2009). We use the updated
genetic distance data from Spolaore and Wacziarg (2018):
https://sites.tufts.edu/enricospolaore/category/personal-webpage/

We use the genetic distance measure between populations of countries introduced by
Spolaore and Wacziarg (2009).¹⁴ Genetic distance measures rely on the fact that during
human evolution, random variations in the form of genes (so-called alleles) occur over
time. Geneticists use the difference in the frequency of alleles to measure genetic distance 
between populations. It is important to stress that these measures only focus on random 
drift variation in genes, i.e., neutral variations which do not give any discernible 
advantage for evolutionary selection. Geneticists can use these variations to calculate 
the proximate time elapsed since two populations became separated and hence the 
number of genealogical steps one must take to reach the last common ancestor 
population. Spolaore and Wacziarg (2009) use $F_{ST}$, a measure of genetic distance, for
42 ethnic groups by Cavalli-Sforza et al. (1994). $F_{ST}$ is a normalized difference in allele
frequencies in two populations: the larger $F_{ST}$, the more different the distribution of
alleles, and hence the more generations one has to go back in time to reach the last
common ancestor, and hence the larger the genetic distance. As countries typically are
populated by multiple ethnic groups, Spolaore and Wacziarg (2009) combine the genetic
distances with country-level ethnic data from Alesina et al. (2003) to measure the genetic
distances between countries, weighted by the ethnic composition of countries’
populations. When country i consists of K ethnic groups and country j consists of M
ethnic groups, genetic distance between i and j is calculated as:


$$\text{Genetic Distance}_{ij} = \sum_{k=1}^{K} \sum_{l=1}^{M} \left( s_{ik} \times s_{jl} \times d_{kl} \right), \qquad (1)$$

where $s_{ik}$ is the share of ethnic group k in country i, $s_{jl}$ is the share of ethnic group l in
country j, and $d_{kl}$ is the $F_{ST}$ genetic distance between ethnic groups k and l. It can be
interpreted as the expected genetic distance between two individuals picked at random
from countries i and j and therefore is a measure of the average genetic distance between 
two countries. 
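
As a minimal sketch of how the weighted measure in Equation (1) can be computed, the following Python snippet sums the share-weighted distances over all pairs of ethnic groups. The function name, the group labels, the shares, and the distance values are hypothetical illustrations and are not taken from Cavalli-Sforza et al. (1994) or Alesina et al. (2003); the distance of a group to itself is assumed to be zero.

```python
def weighted_genetic_distance(shares_i, shares_j, fst):
    """Expected F_ST distance between one random individual from each country.

    shares_i, shares_j: dicts mapping ethnic group -> population share.
    fst: dict mapping frozenset({group_k, group_l}) -> F_ST distance.
    A group's distance to itself is assumed to be 0.
    """
    total = 0.0
    for k, s_ik in shares_i.items():
        for l, s_jl in shares_j.items():
            d_kl = 0.0 if k == l else fst[frozenset((k, l))]
            total += s_ik * s_jl * d_kl
    return total


# Hypothetical illustration data (not real F_ST values or population shares).
shares_i = {"A": 0.8, "B": 0.2}
shares_j = {"B": 0.5, "C": 0.5}
fst = {
    frozenset(("A", "B")): 0.10,
    frozenset(("A", "C")): 0.20,
    frozenset(("B", "C")): 0.30,
}

distance_ij = weighted_genetic_distance(shares_i, shares_j, fst)
distance_ji = weighted_genetic_distance(shares_j, shares_i, fst)
```

With these made-up inputs the measure is symmetric in i and j, as expected from the double sum.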

Control variables 

Standard gravity-type regressors have been shown to be important drivers of RTA 
formation. Time-invariant variables such as geographic distance between countries, 
territorial contiguity and colonizer-colony relationship are collected from Centre 
d’Etudes Prospectives et d’Informations Internationales (CEPII), see Mayer and Zignago
(2011). Population and GDP data are from the World Development Indicators from the 
World Bank. Following Egger et al. (2011), we use the absolute difference in GDP per 
capita to proxy endowment differences such as the difference in the capital labor ratio 
which is highly correlated with GDP per capita. This measure controls for Heckscher- 
Ohlin-type arguments which may influence the formation of trade agreements between 
countries with different endowments. 

Spolaore and Wacziarg (2016b) show that genetic distance is correlated with measures
of linguistic and religious distance between countries. One reason for this may be that 
genetic distance captures differences in language and religion due to differences in the 
composition of countries’ populations which are not captured by simple country-pair 
dummy variables like common language typically used in empirical international trade. 
For their linguistic distance measure, Spolaore and Wacziarg (2016b) use classifications
of languages into language trees and count the number of common nodes in such a
language tree. For example, both French and Italian are part of the Indo-European,
Italic, Romance, Italo-Western branch of languages, i.e., they share four common
nodes. Similar to genetic distance, these linguistic distances can be weighted with the
respective population share of a language in a given country. Similar measures can be
constructed between religions. For example, Christianity, Islam, and Judaism can be
classified as “Near-Eastern Monotheistic Religions”. Again, these religious distance
measures can then be weighted according to the share of a religion within a given
country.¹⁵ We therefore also control for both linguistic and religious distance.¹⁶
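
The node-counting idea can be illustrated with a short Python sketch. It only counts common nodes and averages them with population shares; the mapping from common nodes to the distance measure itself is not described in this section and is therefore left out. The four shared levels for French and Italian are taken from the text above, whereas the leaf labels, the truncated path for a third language, and the population shares are hypothetical placeholders.

```python
def common_nodes(path_a, path_b):
    """Count the nodes two languages share, starting at the root of the tree."""
    count = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        count += 1
    return count


def expected_common_nodes(shares_i, shares_j, paths):
    """Share-weighted average number of common nodes between two countries."""
    total = 0.0
    for lang_k, s_ik in shares_i.items():
        for lang_l, s_jl in shares_j.items():
            total += s_ik * s_jl * common_nodes(paths[lang_k], paths[lang_l])
    return total


# First four levels as in the text; leaf labels are hypothetical placeholders.
paths = {
    "French": ["Indo-European", "Italic", "Romance", "Italo-Western", "French (leaf)"],
    "Italian": ["Indo-European", "Italic", "Romance", "Italo-Western", "Italian (leaf)"],
    # Illustrative, truncated path for a language outside the Italic branch.
    "Other": ["Indo-European", "Germanic"],
}

shared = common_nodes(paths["French"], paths["Italian"])

# Hypothetical language shares for two countries.
shares_i = {"French": 1.0}
shares_j = {"Italian": 0.5, "Other": 0.5}
expected = expected_common_nodes(shares_i, shares_j, paths)
```

The same weighting scheme would apply to a religion tree, with religions in place of languages.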

Differences in countries’ political systems have been shown to be important drivers of 
the timing of RTA formation, see Bergstrand et al. (2016). We therefore include the 
difference in the political freedom between countries i and j at time t (Dif Polity) using
the political freedom index by Marshall et al. (2016). We also use their indices to include
measures of the difference in political regimes (democracy and autocracy scores, Dif
Democracy and Dif Auto, respectively), the difference in party competition in
parliament (Dif Parcomp), the difference in regulation of political participation (Dif
Parreg), and the difference in political competition in government (Dif Polcomp).¹⁷

15 For further details on the calculation of these measures, see Appendix A.2. Spolaore and Wacziarg
(2016b) also show that genetic distance is correlated with a cultural difference measure based on
question-specific distances from the World Values Survey (WVS) for 98 questions. Contrary to genetic distance
which is available for 180 countries, this measure is only available for 74 countries (70 in our sample).
Also, as we interpret genetic distance as a proxy for cultural differences, we do not use this cultural
difference measure to avoid multicollinearity issues.

16 In unreported regressions, we used the standard common language dummy instead of religious and
linguistic distance. Results remain unchanged.

Giuliano et al. (2014) argue that in addition to geographic distance, geographic features 
such as terrain ruggedness which determined transportation costs in the distant past 
have also led to separations of populations and hence to genetic distance. At the same 
time, the probability of forming a trade agreement might still be lower due to high 
transportation costs. Our country fixed effects control for the overall ruggedness of a 
country’s terrain but do not control for the bilateral transportation cost caused by 
ruggedness. We therefore interact the origin and destination countries’ ruggedness to 
proxy for these historic transportation costs. We use the ruggedness measure by Nunn
and Puga (2012).


3 Empirical specification and results 


We follow Chen and Joshi (2010) and estimate a linear probability model of RTA
formation. Linear probability models are preferable to limited dependent variable models
as they are easy to interpret and do not suffer from downward biased coefficient
estimates in the presence of uncorrelated unobserved heterogeneity.¹⁸ They also allow us
to control for time-varying unobserved variables for each origin and destination country
by including country-year-specific dummy variables. Given the dyadic nature of the data set,
we expect correlation in the error term between all observations involving country i or
j, as a country’s general attitude towards trade policy and RTAs and other
country-specific unobserved factors may drive the overall willingness of a country to sign RTAs
with all bilateral partners. This is corroborated by the large degree of correlation of
trade flows for a given exporter i and a given importer j, see, e.g., Spolaore and Wacziarg
(2009) and Egger and Tarlea (2015). We therefore use two-way clustered standard errors
using the method from Cameron et al. (2011).¹⁹ We estimate the following model:

$$RTA_{ijt} = \beta_1 \ln(\text{Genetic Distance})_{ij} + \beta_2 \ln(\text{Geographic Distance})_{ij} + X_{ijt}\beta + \mu_{it} + \eta_{jt} + \varepsilon_{ijt}, \qquad (2)$$

where $RTA_{ijt}$ is a binary variable indicating if there is a regional trade agreement
between country i and country j in year t. $\ln(\text{Genetic Distance})_{ij}$ and
$\ln(\text{Geographic Distance})_{ij}$ are the logarithm of genetic and geographic distance,
respectively. $X_{ijt}$ includes bilateral control variables which may be correlated with
genetic distance.

17 We describe the construction of the variables in Table A5 (Variable Definitions) in the Appendix.

18 Logit and probit model coefficients are biased if there are omitted variables determining RTA formation,
even if these are uncorrelated with the regressors, see Mood (2010). Linear probability models do not
suffer from this problem. They also allow us to include more than 15000 dummies (30800 country pairs in
45 years), circumventing the large computational burden of non-linear probit models with a high number
of dummies. We use the Stata command reghdfe by Guimaraes and Portugal (2010) which allows efficient
estimation of linear regression models with high-dimensional fixed effects. However, our main results are
robust to using a Probit model, see Table A1 in the appendix.

19 Two-way clustering also controls for the fact that $\varepsilon_{ijt} = \varepsilon_{jit}\ \forall\, i, j$ in our application, as
$RTA_{ijt} = RTA_{jit}\ \forall\, i, j$, see Section A.1 in the Appendix. Also note that two-way clustering is strictly more general
than one-way clustering at the country-pair level. Using the latter would lead to too small standard errors
in the presence of two-way clustering, see Cameron et al. (2011).

$\mu_{it}$ and $\eta_{jt}$ represent country-year fixed effects that control for
unobserved country-level determinants of RTAs which may vary over time, effectively
controlling for overall changes in countries’ trade policy as well as country-specific 
business cycle effects which may trigger RTA negotiations.²⁰ The country-year fixed
effects also control for the interdependence of trade policy decisions as a country’s
willingness to sign an RTA with another country depends on the number of RTAs it has
already signed with other countries. Baier et al. (2014) measure this interdependence of
trade policy using so-called “multilateral FTA terms” which measure the number of
RTAs country i has signed with another country k ≠ j at time t. We capture these terms
by the $\mu_{it}$ and $\eta_{jt}$ fixed effects.²¹ We start our sample in 1970 to prevent our fixed effects
from perfectly predicting the variation of $RTA_{ijt}$.²²
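
To make the two-way clustering idea concrete, the following Python sketch combines three one-way cluster-robust variance estimates as "cluster on i" plus "cluster on j" minus "cluster on the (i, j) intersection", which is the construction described by Cameron et al. (2011). It is a deliberately simplified illustration and not the estimation routine used in this paper: it assumes a single regressor, no intercept, no fixed effects, and no small-sample correction, and all function names and data are hypothetical.

```python
from collections import defaultdict


def ols_slope(x, y):
    """OLS slope of y on x in a model without intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def cluster_meat(scores, labels):
    """Sum over clusters of the squared within-cluster sum of scores."""
    sums = defaultdict(float)
    for score, label in zip(scores, labels):
        sums[label] += score
    return sum(value * value for value in sums.values())


def two_way_cluster_variance(x, y, cluster_i, cluster_j):
    """Slope and its two-way clustered variance (no small-sample correction)."""
    beta = ols_slope(x, y)
    scores = [a * (b - beta * a) for a, b in zip(x, y)]  # x times residual
    bread = 1.0 / sum(a * a for a in x)
    intersection = list(zip(cluster_i, cluster_j))
    meat = (
        cluster_meat(scores, cluster_i)
        + cluster_meat(scores, cluster_j)
        - cluster_meat(scores, intersection)
    )
    return beta, bread * meat * bread


# Hypothetical dyadic toy data: four country pairs with clusters on i and on j.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 5.0]
cluster_i = ["A", "A", "B", "B"]
cluster_j = ["C", "D", "C", "D"]

beta, variance = two_way_cluster_variance(x, y, cluster_i, cluster_j)
```

Because the intersection term is subtracted, the combined estimate is not guaranteed to be positive in small samples, which is one reason to treat such a sketch as an illustration only.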


Table 1 Inserts Here 


Table 1 reports the estimates of Equation (2). In column (1), we include (log) genetic 
distance as well as country-year fixed effects, but no controls. Genetic distance has a 
significant negative impact on RTA formation. As genetic distance is highly correlated
with standard regressors used in the literature (bilateral geographic distance, colonial
relationship, and contiguity), we explore whether this result holds up.

20 $\varepsilon_{ijt} = \varepsilon_{jit}\ \forall\, i, j$ also implies that $\mu_{it} = \eta_{it}\ \forall\, i$ and hence only one set of country-specific dummy
variables is needed, not the origin and destination-specific dummies as used, e.g., in trade gravity models.

21 Baier et al. (2014) approximate these multilateral resistance terms by GDP-weighted averages of
bilateral distances with trade partners. These terms also control for a country’s remoteness, i.e., for its
average trade costs across all its trade partners, similar to the approximation proposed by Baier and
Bergstrand (2009) in a trade gravity context. Our fixed effects control for these terms, circumventing the
need to construct proxy indices.

22 The number of observations is only 40 in 1960 when including all variables. It increases to 1274 in 1970.

In column (2), we
only include (log) geographic distance, and, as expected, we find a significant negative
effect of geographic distance on RTA formation, of a magnitude similar to the effect we
found for genetic distance. The size of the coefficient is also in the same ballpark as
results, e.g., by Bergstrand et al. (2016). In column (3), we include both distance 
measures simultaneously. Both genetic and geographic distance have a significant and 
negative impact on RTA formation: if genetic distance between two countries increases 
by one percent, the probability of an RTA between them decreases by 0.06 percentage
points,²³ whereas the same increase in geographic distance decreases the probability of
an RTA by 0.10 percentage points. This effect of genetic distance is economically
meaningful given that the mean of $RTA_{ijt}$ across all years is 0.0657: relative to this mean, the
probability of an RTA decreases by (0.06/100)/0.0657 = 0.009, i.e., about one percent.
We can compare this to the effect of geographic distance: if geographic distance increases 
by one percent, the probability of an RTA decreases by (0.104/100)/0.0657 = 1.6%. 
Hence, genetic distance has a dampening effect on RTA formation of about half the 
magnitude of the commonly accepted effect of geographic distance. The effect of genetic 
distance remains stable when including the measures of linguistic and religious distance 
in column (4) which are correlated with genetic distance.

23 If genetic distance increases by one percent, the probability of an RTA changes by $\beta_1/100$ units,
i.e., by $(\beta_1/100) \times 100 = \beta_1 = -0.06$ percentage points.

Bergstrand et al. (2016) stress
the importance of political factors for RTA formation. In column (5), we follow their
strategy and include several measures for the difference in the political systems of the
two countries. The coefficient of genetic distance remains basically unchanged. Finally,
in column (6), we follow Baier and Bergstrand (2004), Egger et al. (2011) and Baier et 
al. (2014) and include a measure for market size, the sum of both countries’ GDP,
$\text{SUM GDP}_{ijt}$, as well as proxies for differences in endowments, the difference in the
levels of GDP and GDP per capita, $\text{DIF GDP}_{ijt}$ and $\text{DIF GDP Per Capita}_{ijt}$. These
proxy for motives for RTAs along the arguments for trade in monopolistic competition
and Heckscher-Ohlin-type models, respectively. Our estimated coefficient of genetic 
distance remains nearly unchanged. 
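
The back-of-the-envelope comparison of the column (3) estimates discussed above can be retraced with a few lines of Python. The coefficients (−0.06 and −0.104) and the sample mean of the dependent variable (0.0657) are the values quoted in the text; the function and variable names are only illustrative.

```python
def relative_effect(coefficient, mean_outcome):
    """Effect of a one percent increase in the regressor, relative to the mean outcome.

    In a linear probability model with a logged regressor, a one percent
    increase changes the probability by coefficient / 100.
    """
    change_in_probability = coefficient / 100.0
    return change_in_probability / mean_outcome


mean_rta = 0.0657  # mean of the RTA indicator across all years, from the text

genetic = relative_effect(-0.06, mean_rta)      # about -0.9 percent of the mean
geographic = relative_effect(-0.104, mean_rta)  # about -1.6 percent of the mean
ratio = genetic / geographic                    # relative size of the two effects
```

The ratio is somewhat above one half, in line with the statement that the effect of genetic distance is about half that of geographic distance.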

Summing up, genetic distance reduces the probability of RTA formation in a large panel 
of countries, even when controlling for a wide variety of variables typically used in the 
literature. It therefore seems to be a simple and readily available catch-all proxy for 
coordination costs arising from cultural differences which have a negative impact on 
RTA formation. 

Geographic distance has been shown to have a stable negative impact on trade flows
over time, see Disdier and Head (2008).²⁴ As trade flows and RTA formation are driven
by common factors, it seems natural to explore whether genetic and geographic distance
have a constant effect on RTA formation over time or whether there are trends in their
effects. To do so, we estimate a series of cross-sectional regressions for every year t in
our sample using the following specification:

$$RTA_{ij} = \beta_1 \ln(\text{Genetic Distance})_{ij} + \beta_2 \ln(\text{Geographic Distance})_{ij} + X_{ij}\beta + \mu_i + \eta_j + \varepsilon_{ij} \qquad (3)$$

24 The persistent negative effect of distance on bilateral trade flows has been referred to as the distance
puzzle. It has spurred a large literature which tries to explain this fact, e.g., Lin and Sim (2012), Yotov
(2012), and Larch et al. (2016). None of these papers investigates the impact of genetic distance over
time on bilateral trade flows.


Table 2 Inserts Here 


$\mu_i$ and $\eta_j$ are country fixed effects for the paired countries. Table 2 presents the
regression results for selected years. Results are similar to our panel regressions: genetic 
distance negatively affects RTA formation in all columns except for the regression using 
data from 1970 in column (1). For the remaining years the effect remains constant. 
Geographic distance has a negative effect on RTA formation which increases for the 
years 2005, 2010, and 2014. The influence of both religious and linguistic distance, 
conditional on genetic distance and geographic distance is mostly not significant, in line 
with our panel results. Colonial status loses its significance over time, in line with the 
deterioration in trade between former colonies over time as documented by Head et al. 
(2010). This effect seems to spill over into RTA formation as well. The measures for the 
differences in political systems have a significant impact on RTA formation, but not 
consistently over time. Similarly, the measures for market size and endowments, as
predicted by theories of economic drivers of RTA formation such as Baier and
Bergstrand (2004), do not consistently explain RTA formation, except for the difference
in GDP per capita.²⁵

In Table 2, the number of observations increases over time which may drive results. We
therefore re-estimate Table 2 using only the country pairs observed in 1970 throughout the whole
sample period. We present these results in Table A2 in the Appendix. Results remain
similar. Importantly, genetic distance continues to exert a negative impact on RTA
formation.

In addition to the years presented in Table 2, we estimate Equation (3) for all years in
our sample beginning in 1970. We plot the estimated coefficients for each year for both
(log) genetic and (log) geographic distance in Figure 1.

Figure 1 Inserts Here

Both geographic and genetic distance have a persistent and negative impact on RTA
formation, except for 1970, probably due to the relatively small number of RTAs in 
1970, see Figure 2. Interestingly, until 1990, we cannot reject the null hypothesis that 
the impact of genetic distance on RTA formation is as large as the impact of geographic 
distance, the measure of distance typically used in the literature. With the end of the
Cold War, the negative impact of geographic distance becomes stronger. Genetic
distance also matters more, but to a lesser extent. Overall, cultural differences and
communication or negotiation costs proxied by genetic distance seem to act as a
significant and economically important barrier to RTA formation.

25 We also estimate a probit model equivalent to the linear probability model from Equation (3). We
report results in Table 3 in the Appendix. Estimates are qualitatively similar. Note that the difference
in sample size compared to Table 2 stems from the fact that many observations must be dropped to avoid
a perfect predictor problem for the probit estimator.


4 Robustness Checks 


In the following, we probe our results for robustness across a battery of specifications, 
subsamples, and different potential omitted variables. We present results of these 
robustness checks for our panel regressions in Table 3. For convenience, column (1) 
reproduces column (6) of Table 1, our most stringent specification so far. We have used 
the logarithm of genetic distance to mimic the specification typically used in the 
literature for geographic distance. In our data, 30 country pairs have a genetic distance 
of 0 and hence these observations are excluded from the sample when taking the log of 
genetic distance.²⁶ We therefore include the level of genetic distance in column (2) to
include these observations. We again find a significant and negative impact of genetic 
distance on RTA formation. 

Differences in legal origin of countries reduce the amount of trade between countries,
see, e.g., Felbermayr and Toubal (2010).

26 There are 15400 distinct country pairs (176 countries) in our sample, and 30 pairs have zero genetic
distance. Among those, 6 are between European countries (Belgium, Iceland, Ireland, and the Netherlands,
i.e., 4×3/2 distinct pairs). Observations with zero genetic distance account for 0.2% of all observations in
the panel for the world sample and 18% of all observations in the Europe 22 sample we use in columns
(5) and (6).

Trade agreements may therefore be particularly
important for countries with different legal origins to overcome these additional trade
costs, increasing the likelihood of an RTA. At the same time, trade negotiations may be
particularly difficult between countries with different legal systems.²⁷ Baier and
Bergstrand (2004) do not find evidence that common legal origin matters for RTA 
formation, but it could be that our genetic distance measure picks up this variation and 
leads us to attribute the effect of difference in legal systems to genetic distance. We 
therefore use the La Porta et al. (1998) measure of legal origin and define a binary 
variable Common Legal Origin,; which is 1 if countries i and j share the same legal 
origin, and 0 otherwise. A drawback of this measure is that it is only available for 49 
countries, reducing our sample considerably. Genetic distance, both in logs and levels, 
still has a significant negative effect on RTA formation, see columns (3) and (4). 

Giuliano et al. (2014) argue that geographic features which determined transportation 
costs in the distant past have also led to separations of populations and hence to genetic 
distance. Indeed, geographic distance highly correlates with genetic distance.” In their 
analysis of 22 European countries, they find that genetic distance does not exert a 


significant effect on trade flows once one controls for geographic distance. Genetic 


7 During the stalled negotiations for a potential trade agreement between the European Union and the 
United States, a commonly repeated argument was that differences in legal philosophies in consumer 
protection law (precautionary principle in the EU versus risk assessment and cost-benefit principles in the 
US) made an agreement difficult to reach, see Bergkamp and Kogan (2013). 

238 The correlation between genetic distance and geographic distance in levels across all years is 0.542 in 
our sample. In logarithms, their correlation is 0.600. 
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distance is lower within Europe than in our worldwide sample of 176 countries.” In 
column (5), we use the same 22 countries as Giuliano et al. (2014). Our results are 
consistent with their results for trade flows: genetic distance does not affect RTA 
formation within Europe. This is also confirmed when controlling for common legal 
origin, see column (6). Hence our results indicate that genetic distance is important for 
RTA formation between countries with higher genetic distance. 

Our sample from 1970 to 2014 includes the end of the Cold War and the collapse of the 
Soviet Union (USSR). These events have significantly changed the geopolitical 
environment in which trade agreement negotiations take place: Gowa and Mansfield 
(1993) argue that this shift from a bipolar to a multipolar world affects the formation 
(and dissolution) of trade agreements. This shift is also clearly visible in the number of 
RTAs in place which has picked up after the dissolution of the USSR in 1991, see Figure 
2. Relatedly, the countries which emerged from the former USSR increase our sample 
and results may be driven by these new countries. We therefore rerun our regressions 
after excluding all former Soviet Union countries.*’ Accordingly, Figure Al in the 
Appendix redraws Figure 1. Still, genetic distance remains significant throughout the 
sample period, with the exception from 1970 to 1972. For most of the sample period, we 


cannot reject the null hypothesis that genetic distance has a similar impact as geographic 


” The average genetic distance between the countries in Giuliano et al. (2014)’s sample is 0.00502 (sd. 
0.00421), whereas in our full sample it is 0.03692 (sd. 0.01854). See Table A8 in the Appendix. 
30 The list of excluded countries is presented in Table A9 in the Appendix. 
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distance on RTA formation, strengthening our baseline results. Only for the last ten 
years of our sample, geographic distance has a larger effect than genetic distance, but 
genetic distance remains exerting a significant negative impact on RTA formation. Table 
A3 presents the cross-sectional estimates underlying Figure A1, i.e., re-estimates Table 
2 on a smaller sample by excluding former USSR countries. Results remain similar. 

A history of military conflicts can motivate countries to deepen trade integration
between them, an argument particularly applied to the European integration process,
see Martin et al. (2012). Also, military conflicts lead to lower levels of trust between
countries, negatively affecting trade, see Guiso et al. (2009). At the same time, countries
with lower genetic distance have a higher likelihood to engage in wars as they share
similar preferences and compete for similar rival goods, see Spolaore and Wacziarg
(2016a), leading to a potential omitted variable bias. We measure the experience of
conflict and war-related events by a series of variables:³¹ the total duration of previous
wars between two countries, measured in days (WAR Duration_ijt),³² the recentness of
the latest war between the two countries (WAR Recentness_ijt), measured by a dummy
variable which is 1 when the two countries have experienced wars in the last 20 years
and 0 otherwise, the frequency of wars between 1870 and 1945
(WAR Freq (pre 1945)_ij), the existence of a military alliance between the two countries
(Military Alliance Relationship_ij),³³ and the bilateral correlation in UN votes
(UN Vote Correlation_ij)³⁴ as a measure of implicit political alliance. We present results
of these regressions in Table A4. Note that due to data availability, these regressions are
run on smaller samples than our baseline results. Geopolitical motives are drivers behind
RTA formation, as both military alliance and UN vote correlation have significant and
positive effects for most years. Interestingly, a common history of war between countries
does not have a stable impact. Importantly, higher genetic distance continues to have a
significant and negative impact on RTA formation. It appears to be an economically
important driver of the formation of regional trade agreements, of similar magnitude
as the commonly used geographic distance. It should therefore be included as a simple
proxy variable for difficult-to-observe cultural differences and communication costs in
studies of RTA formation.

31 We describe the construction of the variables used in Table A5 in the Appendix.
32 Data are from Kreutz (2010).
33 Data on military conflicts and military alliances are taken from the Correlates of War project
http://www.correlatesofwar.org/, see Gibler (2009) and Maoz et al. (2018).
34 Data are from Voeten et al. (2009).


5 Conclusion


Negotiations of trade agreements are often time-consuming and do not always reach
an agreement. Understanding the determinants of successfully concluded regional trade
agreements can help to identify drivers and potential pitfalls for future trade
agreements. This paper examines the role of genetic distance between the populations
of countries on RTA formation. Genetic distance measures how genetically related two
populations are in terms of their last common ancestor. It is a readily available proxy
for communication costs arising from differences in culture and preferences. Trade
negotiations are particularly affected by these costs as they involve infrequent,
high-stakes interactions between often changing high-level politicians or bureaucrats from
different cultural backgrounds where establishing trust and a common understanding
may be difficult. We find that country pairs with larger genetic distances between their
populations have a lower probability of signing an RTA. This effect is stable over time,
has increased in importance since the end of the Cold War, and is distinct from the
impact of geographic distance on RTA formation. It is robust to controlling for other
determinants of RTA formation typically used in the literature and holds across
different subsamples. Our results are consistent with a larger literature which
documents the impact of cultural differences proxied by genetic distance on economic
outcomes. Our results should not be interpreted as evidence for genetic determinism
of trade policy. Instead, our results document the usefulness of genetic distance as a
readily available proxy for difficult-to-measure bilateral communication and
negotiation costs due to cultural differences across countries. Trade policy makers who
want to engage in trade negotiations should be aware of these differences to avoid the
premature failure of negotiations of a mutually beneficial trade agreement.
Figure 1. Coefficients of Ln(Genetic Distance)_ij and Ln(Geographic Distance)_ij
Notes: Coefficients from cross-sectional OLS regressions based on Equation (3). Red and blue lines are the
coefficients of genetic and geographic distance, respectively. The grey areas are the 95% confidence
intervals of the coefficients (plus and minus 1.96 times the standard error of the estimated regression
coefficient).

Figure 2. Total Number of Country Pairs with RTAs, 1950-2014
Note: Graph depicts the total number of country pairs which are covered by an RTA (free trade agreement
and/or customs union). Number of countries is N = 176, hence the total number of country pairs is
N × (N − 1) = 30800.
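The pair counts quoted in the paper can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the share computed at the end assumes that every country pair is observed equally often, which need not hold in the actual panel.

```python
from itertools import combinations, permutations

n_countries = 176
countries = range(n_countries)

# Ordered (directed) pairs: each pair appears twice, as (i, j) and (j, i).
ordered_pairs = sum(1 for _ in permutations(countries, 2))

# Distinct (undirected) pairs: each pair is counted once.
distinct_pairs = sum(1 for _ in combinations(countries, 2))

# The four European countries with zero genetic distance among each other.
zero_distance_europe = ["Belgium", "Iceland", "Ireland", "Netherlands"]
european_zero_pairs = list(combinations(zero_distance_europe, 2))

# Share of zero-distance pairs among all distinct pairs (assumption: equal
# number of observations per pair).
share_zero = 30 / distinct_pairs

print(ordered_pairs, distinct_pairs, len(european_zero_pairs))
print(f"{100 * share_zero:.1f}% of distinct pairs have zero genetic distance")
```

The figure note counts ordered pairs, N × (N − 1), whereas the footnote on zero genetic distance counts distinct pairs, N × (N − 1)/2; the two numbers differ by exactly a factor of two.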
Table 1. OLS Panel Regression Coefficient Estimates (1970-2014)


(1) (2) (3) (4) (5) (6) 
-0.115*** -0.060*** -0.063*** -0.066*** -0.063*** 
Prieenetie Distance); 0.115 0.060 0.063 0.066 0.063 
(0.008) (0.008) (0.010) (0.010) (0.010) 
-0.137*** -0.104*** -0.105*** -0.116*** -0.137*** 
Ln(Geographic Distance);; Gist p10 O08 ae oot 
(0.009) (0.008) (0.012) (0.014) (0.016) 
-0.01 01 i 
Ln(Religious Distance) ;; oe ie 002a 
(0.021) (0.021) (0.024) 
Ln(Linguistic Distance);; G03 p08 a2 
(0.011) (0.011) (0.014) 
-0.014 -0. 
Colonial Relationship, ; uae nee 
(0.031) (0.030) 
, 0.097*** 0.095*** 
Contiguous;, 
(0.029) (0.030) 
DIF Democracy; jt is 0:005 
(0.003) (0.003) 
xk * 
DIF Autos 0.010 0.008 
(0.005) (0.005) 
Z xk È 
DIF Polity 0.008 0.005 
(0.004) (0.004) 
DIF Parregij -0.002 0.001 
(0.004) (0.004) 
-0.016** -0.014** 
DIF Parcomp,;+ 
(0.007) (0.007) 
0.011*** 0.010*** 
DIF Polcomp;;, 
(0.003) (0.003) 
-0.001 
Ruggedness, x Ruggedness , 
14 $ a (0.002) 
DIF GDP; -0.001 
(0.003) 
SUM GDP x -0.013 
(0.010) 
-0.013*** 
DIF GDP Per Capita; i sage 
(0.004) 
Country x Year Fixed Effects Yes Yes Yes Yes Yes Yes
N 1383300 1336320 1336320 467550 367318 310692 
adj. R? 0.303 0.337 0.356 0.348 0.364 0.393 


Notes: Two-way cluster-robust standard errors clustered at the origin and destination country in
parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Regressions are based on Equation (2).
Table 2. Cross-section OLS Coefficient Estimates for Specific Years

                                  (1)         (2)         (3)         (4)         (5)         (6)         (7)
                                  1970        1975        1985        1990        2005        2010        2014
Ln(Genetic Distance)_ij           0.009       -0.065***   -0.045***   -0.037**    -0.083***   -0.075***   -0.067***
                                  (0.009)     (0.018)     (0.017)     (0.017)     (0.016)     (0.017)     (0.019)
Ln(Geographic Distance)_ij        -0.072***   -0.110***   -0.087***   -0.088***   -0.182***   -0.199***   -0.171***
                                  (0.016)     (0.021)     (0.019)     (0.019)     (0.021)     (0.021)     (0.025)
Ln(Religious Distance)_ij         0.04        -0.006      0.002       0.012       0.058*      0.027       -0.096*
                                  (0.036)     (0.034)     (0.036)     (0.034)     (0.032)     (0.039)     (0.055)
Ln(Linguistic Distance)_ij        0.000       0.035***    0.036***    -0.007      0.004       0.011       0.021
                                  (0.011)     (0.013)     (0.013)     (0.026)     (0.023)     (0.013)     (0.020)
Ruggedness_i x Ruggedness_j       -0.003**    0.006       0.004       0.004       -0.005**    -0.004**    -0.005**
                                  (0.001)     (0.004)     (0.003)     (0.003)     (0.002)     (0.002)     (0.002)
Colonial Relationship_ij          0.156       -0.069**    -0.049***   -0.075***   0.019       0.036       0.058
                                  (0.101)     (0.030)     (0.013)     (0.015)     (0.043)     (0.044)     (0.055)
Contiguous_ij                     0.063       0.083*      0.022       0.066       0.117**     0.126***    0.117**
                                  (0.045)     (0.044)     (0.037)     (0.043)     (0.045)     (0.045)     (0.047)
DIF Democracy_ijt                 -0.002      0.002       0.013**     -0.007      -0.004      -0.009      -0.017**
                                  (0.003)     (0.005)     (0.006)     (0.005)     (0.006)     (0.006)     (0.008)
DIF Auto_ijt                      0.000       0.005       0.017**     0.012*      0.005       -0.004      -0.003
                                  (0.004)     (0.006)     (0.007)     (0.006)     (0.006)     (0.008)     (0.008)
DIF Polity_ijt                    0.004       -0.006      -0.018***   -0.005      -0.003      -0.001      0.004
                                  (0.003)     (0.005)     (0.006)     (0.005)     (0.005)     (0.005)     (0.007)
DIF Parreg_ijt                    -0.012**    -0.002      -0.006      -0.007      0.012*      0.011       0.017*
                                  (0.006)     (0.007)     (0.005)     (0.005)     (0.007)     (0.007)     (0.009)
DIF Parcomp_ijt                   0.025**     -0.001      -0.043***   -0.009      -0.018*     -0.004      0.007
                                  (0.010)     (0.014)     (0.015)     (0.012)     (0.011)     (0.010)     (0.012)
DIF Polcomp_ijt                   -0.010**    0.001       0.019***    0.011**     0.005       0.007       -0.004
                                  (0.005)     (0.006)     (0.006)     (0.005)     (0.005)     (0.005)     (0.006)
DIF GDP_ijt                       -0.006      0.000       -0.002      0.001       -0.003      -0.002      0.000
                                  (0.005)     (0.005)     (0.004)     (0.004)     (0.005)     (0.004)     (0.005)
SUM GDP_ijt                       0.031**     -0.006      -0.014      -0.016      -0.026      -0.01       -0.008
                                  (0.014)     (0.014)     (0.012)     (0.012)     (0.016)     (0.013)     (0.015)
DIF GDP Per Capita_ijt            -0.008*     -0.011**    -0.012***   -0.018***   -0.017***   -0.014**    -0.009
                                  (0.004)     (0.004)     (0.004)     (0.005)     (0.006)     (0.006)     (0.007)
Country x Year Fixed Effects      Yes         Yes         Yes         Yes         Yes         Yes         Yes
N                                 3883        4337        5636        6179        8863        8997        8735
adj. R²                           0.227       0.383       0.365       0.362       0.441       0.409       0.431

Notes: Two-way cluster-robust standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Regressions are based on Equation (3).
Table 3. OLS Panel Regression Coefficient Estimates (1970-2014): Robustness Checks


(1) (2) (3) (4) (5) (6) 
All All All All Europe 22 Europe 22 
ineen Daia -0.063*** -0.044*** -0.006 -0.009 
(0.010) (0.015) (0.006) (0.015) 
Genetic Distance;j e o 
xx xx 2K KK 
Common Legal Origin,, nee nt r 
ineke Dine 0.025 0.027 0.185*** 0.183*** -0.062 -0.061 
(0.024) (0.025) (0.045) (0.048) (0.036) (0.052) 
E tenes 0.02 0.025* 0.008 0.010 0.119 0.328 
(0.014) (0.014) (0.011) (0.011) (0.171) (0.232) 
A A -0.137*** -0.150*** 0.000 -0.262*** 0.003 0.027 
(0.016) (0.016) (0.000) (0.020) (0.018) (0.017) 
E E e A -0.001 -0.001 -0.021 -0.019 -0.007 -0.01 
(0.002) (0.002) (0.013) (0.012) (0.007) (0.012) 
E E TERR -0.006 -0.014 -0.05 -0.080*** 0.027 0.012 
ts (0.030) (0.033) (0.030) (0.029) (0.027) (0.088) 
Cee 0.095*** 0.101*** -0.133* -0.142* 0.027 0.005 
= (0.030) (0.031) (0.076) (0.072) (0.018) (0.027) 
pin Donita -0.005 -0.004 -0.006 -0.003 -0.043 -0.036 
(0.003) (0.004) (0.009) (0.009) (0.033) (0.045) 
DIE Aan oa oo ee 
DIF Polity., -0.005 -0.007* -0.004 -0.008 0.033*** 0.030*** 
g (0.004) (0.004) (0.008) (0.008) (0.008) (0.008) 
pie Pairis 0.001 0.000 -0.002 -0.005 -0.043*** -0.028 
(0.004) (0.004) (0.012) (0.012) (0.006) (0.086) 
pir Porm oe ee RERE 
DIP pocoo 0.010*** 0.011*** 0.010** 0.012*** -0.012 0.000 
(0.003) (0.003) (0.004) (0.004) (0.032) (0.000) 
-0.001 -0.001 0.001 -0.001 0.005 0.008 
ERRER (0.003) (0.003) (0.005) (0.006) (0.009) (0.018) 
SUM GDP,» -0.013 -0.016 0.038 0.044 -0.050* -0.076 
(0.010) (0.011) (0.028) (0.029) (0.028) (0.064) 
DIF GDP Per Copia -0.013*** -0.015*** -0.006 -0.009 -0.002 -0.011 
(0.004) (0.004) (0.010) (0.010) (0.005) (0.008) 
Country x Year Fixed Effects Yes Yes Yes Yes Yes Yes
N 310692 311463 41300 41480 4688 3285 
adj. R? 0.393 0.389 0.582 0.581 0.816 0.549 


Notes: Two-way cluster-robust standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Regressions are based on Equation (3).
Appendix
A.1 On Clustering Standard Errors


In our regressions in both Equation (2) and Equation (3), we use the two-way clustering proposed
by Cameron, Gelbach and Miller (2011), treating the origin country i and the
destination country j as two groups.

In the following, we show that two-way clustering also accounts for the fact that
$\varepsilon_{ij} = \varepsilon_{ji} \;\forall\, i,j$, i.e., that the residuals are identical for the two
occurrences of a country pair within a data set. How do we know that the residuals are
identical?

Note that $RTA_{ijt} = RTA_{jit} \;\forall\, i,j$, i.e., the regional trade agreement dummy does not
have a direction (this is different to, e.g., trade flows: exports from China to Germany
are different from exports from Germany to China). Similarly, $X_{ij} = X_{ji} \;\forall\, i,j$, i.e., our
regressors also do not have a direction. It can be shown that this immediately implies
that $\eta_i = \mu_j \;\forall\, i,j$.³⁵

35 This is a well-known fact in the gravity literature, see, e.g., Head and Mayer (2014), p. 140: In a bilateral
gravity equation of symmetric bilateral trade flows regressed on symmetric trade cost measures, it can be
shown that estimated importer and exporter dummies are identical. The proof applies in our setting as
we can interpret our dependent variable as a trade flow. This is not model dependent but is simply a fact
of the properties of OLS. This in turn implies that $\varepsilon_{ij} = \varepsilon_{ji} \;\forall\, i,j$, i.e., there is a (perfect) correlation
between the error terms. We therefore cluster our standard errors to account for clustering within the
country pair.

Having established that $\varepsilon_{ij} = \varepsilon_{ji} \;\forall\, i,j$, we can proceed to establish that two-way
clustering accounts for this perfect correlation of the residual for the two observations
of the country pair. The variance-covariance estimator by Cameron, Gelbach and Miller
(2011) assumes that

$$E(\varepsilon_{ijgh}\,\varepsilon_{lmg'h'} \mid X_{ijgh}, X_{lmg'h'}) = 0, \quad \text{unless } g = g' \text{ or } h = h',$$

where ij and lm refer to two country pairs (i.e., observations in the data set) and where
we now indicate explicitly the two groups (i.e., clusters), in our application the first and
the second country in a country pair, by g and h. If $g = g'$ or $h = h'$, i.e., within an
origin or destination country, the estimator by Cameron, Gelbach and Miller (2011)
allows for arbitrary correlation between the errors, including perfect correlation, i.e.,
$\varepsilon_{ijgh} = \varepsilon_{lmg'h'}$. We have shown above that $\varepsilon_{ij} = \varepsilon_{ji} \;\forall\, i,j$. This
implies

$$E(\varepsilon_{ijgh}\,\varepsilon_{jig'h'} \mid X_{ijgh}, X_{jig'h'}) = E(\varepsilon_{ijgh}\,\varepsilon_{ijgh} \mid X_{ijgh}, X_{ijgh}),$$

and hence the estimator allows for arbitrary correlation between $\varepsilon_{ijgh}$ and $\varepsilon_{jig'h'}$,
including our case of perfect correlation.

It is helpful to illustrate with an example. Imagine we have a data set of RTAs between
three countries A, B, and C. Hence, we have the following observations in our data set:
AB, AC, BA, BC, CA, and CB. We have three groups indicated by g, which are the
three groups where each country is the first country in the country pair, and three groups
indicated by h, which are the three groups where each country is the second country in
the country pair. Label the groups in the following way: g = 1 consists of country pairs
AB and AC, g = 2 consists of BA and BC, and g = 3 of CA and CB. Similarly,
h = 1 consists of the country pairs BA and CA, h = 2 consists of AB and CB, and h = 3 of AC
and BC. Then,

$$E(\varepsilon_{AB12}\,\varepsilon_{BA21} \mid X_{AB12}, X_{BA21}) = E(\varepsilon_{AB12}\,\varepsilon_{AB12} \mid X_{AB12}, X_{AB12}),$$

and hence two-way clustering between origin and destination countries using the estimator of
Cameron, Gelbach and Miller (2011) allows for the perfect correlation between $\varepsilon_{AB12}$
and $\varepsilon_{BA21}$.

A.2 On the Calculation of Linguistic and Religious Distance 


Spolaore and Wacziarg (2016b) use data from Fearon (2003) on linguistic trees to
calculate the distance between languages in a way similar to how they calculate genetic
distances. To continue the example from the main text, French is categorized as "Indo-European -
Italic - Romance - Italo-Western - Western - Gallo-Iberian - Gallo-Romance
- Gallo-Rhaetian - Oïl - Français". Similarly, Italian is classified as "Indo-European -
Italic - Romance - Italo-Western - Italo-Dalmatian". Therefore, the number of common
nodes between Italian and French is 4: Indo-European, Italic, Romance, Italo-Western.
As with genetic distance, we use their weighted distance measure, i.e., the expected or
weighted number of common nodes:

$$CN^{w}_{ij} = \sum_{k=1}^{I} \sum_{l=1}^{J} \left( s_{ik} \times s_{jl} \times c_{kl} \right),$$

where $s_{ik}$ is the share of ethnic group k in country i, $s_{jl}$ is the share of ethnic group l
in country j, and $c_{kl}$ is the number of common nodes between the languages of ethnic groups
k and l. $CN^{w}_{ij}$ ranges from 0 to 15. We also follow the transformation of Spolaore and
Wacziarg (2016b) and Fearon (2003) to rescale $CN^{w}_{ij}$ to lie between 0 and 1:

$$TLD^{w}_{ij} = \sqrt{\frac{15 - CN^{w}_{ij}}{15}},$$

where $TLD^{w}_{ij}$ refers to tree-based linguistic distance.

The calculation of religious distance by Spolaore and Wacziarg (2016b) also uses a tree-based
method. The trees consist of broadly classified religious groups, which are
further divided into finer classifications. The number of
common nodes measures the similarity of two religions, analogous to the calculation
of linguistic distance.

Compared to using dummies like common official language or common religion, both the
linguistic and religious distance measures provide more accurate information about the
corresponding differences between two countries as they take into account the
composition of the population.
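The common-node count and the weighting by population shares can be sketched in a few lines. The example below is an illustration under stated assumptions, not the procedure used by Spolaore and Wacziarg: the function names and the population shares are hypothetical, the classification strings are the two quoted above, and the rescaling assumes a square-root normalization with a maximum of 15 nodes.

```python
from math import sqrt

MAX_NODES = 15  # maximum number of common nodes stated in the text

# Classification paths as quoted in the text, from the root downwards.
trees = {
    "French": ["Indo-European", "Italic", "Romance", "Italo-Western", "Western",
               "Gallo-Iberian", "Gallo-Romance", "Gallo-Rhaetian", "Oïl", "Français"],
    "Italian": ["Indo-European", "Italic", "Romance", "Italo-Western",
                "Italo-Dalmatian"],
}


def common_nodes(path_a, path_b):
    """Count the shared nodes from the root until the two paths diverge."""
    count = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        count += 1
    return count


def weighted_common_nodes(shares_i, shares_j):
    """Expected number of common nodes, weighting each group pair by its shares."""
    return sum(
        s_ik * s_jl * common_nodes(trees[k], trees[l])
        for k, s_ik in shares_i.items()
        for l, s_jl in shares_j.items()
    )


def tree_linguistic_distance(cn):
    """Rescale common nodes to a distance between 0 and 1 (assumed square-root form)."""
    return sqrt((MAX_NODES - cn) / MAX_NODES)


# Hypothetical population shares, for illustration only.
shares_i = {"French": 0.9, "Italian": 0.1}
shares_j = {"Italian": 1.0}

cn_french_italian = common_nodes(trees["French"], trees["Italian"])
cn_weighted = weighted_common_nodes(shares_i, shares_j)
tld = tree_linguistic_distance(cn_weighted)
print(cn_french_italian, round(cn_weighted, 2), round(tld, 3))
```

Note that in this simplified sketch two speakers of the same language share only as many nodes as their classification path is long, so identical languages do not automatically reach the maximum of 15.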
A.3 Additional Figures and Tables


Table A1. Cross-section Probit Coefficient Estimates for Specific Years


(1) (2) (3) (4) (5) (6) (7) 
1970 1975 1985 1990 2005 2010 2014 
a 0.173 -0.688*** -0.770** -0.645* -0.367*** -0.358***  -0.441*** 
POENE ER Re (0.126) (0.226) (0.355) (0.362) (0.073) (0.058) (0.063) 
r E E T -1.473*** -3.057***  -2,552¥**  -3,003***  -1,783*** -1.363***  -1.,137*** 
(0.174) (0.322) (0.310) (0.320) (0.093) (0.062) (0.062) 
nest Tr -0.424 0.312 0.860* 1.234*** 0.221 -0.447***  -0.854*** 
w (0.319) (0.484) (0.459) (0.382) (0.159) (0.130) (0.118) 
Te ae eee -1.411*** -1.704***  -4.116*** -1.612** -0.121 -0.147 0.173 
K (0.296) (0.408) (1.058) (0.651) (0.106) (0.142) (0.107) 
eonia Relesionshipe 0.865*** -0.442 -0.780* -0.602* 0.702*** 0.469** 0.497*** 
(0.315) (0.441) (0.403) (0.363) (0.261) (0.194) (0.191) 
ee 0.067 0.304 -1.546** -1.222* -0.351 0.239 0.437* 
z (0.394) (0.857) (0.762) (0.633) (0.247) (0.218) (0.230) 
DEDET 0.817** 0.375** -0.141 -0.573***  -0.107*** -0.083*** = -0.112*** 
(0.391) (0.147) (0.546) (0.199) (0.031) (0.031) (0.035) 
DIP AU 0.807** 0.409*** -0.207 -0.144 0.008 -0.069*** -0.005 
(0.397) (0.123) (0.410) (0.125) (0.029) (0.025) (0.025) 
DIF Polity, -0.657* -0.438*** -0.041 0.047 0.004 0.026 0.033 
es (0.391) (0.110) (0.501) (0.136) (0.023) (0.024) (0.027) 
DIF Parreg,, 0.275 1.050*** 1.242*** 0.499***  0.227*** 0.101*** 0.123*** 
ma (0.221) (0.358) (0.256) (0.163) (0.037) (0.031) (0.029) 
DIF Parcomp,, 1.570*** 0.22 -2.470*** -0.423 -0.186** 0.055 0.041 
ak (0.309) (0.323) (0.487) (0.471) (0.084) (0.080) (0.084) 
DIF Polcomp,, -0.860*** -0.301 1.008*** 0.614*** 0.054 0.005 -0.047 
a (0.145) (0.204) (0.191) (0.159) (0.040) (0.037) (0.039) 
E AA E O -0.175** -0.442*** -0.141 -0.037 -0.068** -0.021 -0.022 
' ? (0.081) (0.130) (0.113) (0.128) (0.027) (0.018) (0.019) 
PrOD. -0.198 -0.058 0.056 0.199 0.035 -0.009 -0.008 
g (0.160) (0.127) (0.177) (0.133) (0.044) (0.034) (0.033) 
Sane: 1.425*** -0.488* -0.612* -0.863***  -0,584*** -0.149* -0.074 
m (0.353) (0.290) (0.354) (0.279) (0.114) (0.077) (0.076) 
DIF GDP Per Capita, -0.256*** -0.378** -0.253* -0.314**  -0.138*** -0.082*** -0.035 
(0.086) (0.159) (0.143) (0.136) (0.028) (0.023) (0.024) 
Country Fixed Effects Yes Yes Yes Yes Yes Yes Yes
N 1028 1172 1003 1347 6949 8018 7796 
Pseudo R? 0.5776 0.8143 0.8256 0.8305 0.6462 0.5461 0.5448 


Notes: Robust standard errors in brackets. * p < 0.1, ** p < 0.05, *** p < 0.01. 
Table A2. Cross-section OLS Coefficient Estimates for Specific Years (Using Country
Pairs with Data Available for 1970)


(1) (2) (3) (4) (5) (6) (7)
1970 1975 1985 1990 2005 2010 2014
Cee Data, 0.009 -0.062***  — -0.065*** = -0.067*** — -0.103***  —_-0.094***  -0.088*** 
(0.009) (0.019) (0.018) (0.018) (0.018) (0.022) (0.028) 

A ae 0.04 -0.012 0.007 0.026 0.074* 0.068 -0.085 
(0.036) (0.039) (0.042) (0.044) (0.038) (0.042) (0.069) 

MEROE Diane): 0.000 0.039*%**  0.045*** 0.001 -0.002 0.007 0.017 
(0.011) (0.013) (0.015) (0.029) (0.025) (0.012) (0.018) 
ta nose Distance -0.072*** = -0.117***  -0.110***  -0.118%**  -0.188***  -0.214***  -0.181*** 
(0.016) (0.022) (0.022) (0.023) (0.026) (0.026) (0.032) 

EE E 0.156 -0.057**  -0.043**  -0.069*** -0.003 0.027 0.024 
2 (0.101) (0.025) (0.016) (0.021) (0.019) (0.024) (0.058) 

Supa 0.063 0.084* 0.033 0.06 0.139** 0.123** 0.139** 
(0.045) (0.045) (0.042) (0.050) (0.056) (0.057) (0.058) 

Vil perosi -0.002 0.000 0.01 -0.014** -0.004 -0.02 0.018 
ae (0.003) (0.006) (0.008) (0.006) (0.008) (0.020) (0.017) 

DIF Auto 0.000 0.002 0.015* 0.006 -0.001 -0.023 0.018 
(0.004) (0.006) (0.008) (0.006) (0.007) (0.019) (0.016) 

DIF Polity,, 0.004 -0.004 -0.016** 0.001 -0.002 0.013 -0.023 
7 (0.003) (0.005) (0.007) (0.005) (0.006) (0.017) (0.016) 

DIF Parreg,, -0.012** -0.001 -0.004 -0.003 0.019* 0.007 0.023 
Y (0.006) (0.007) (0.006) (0.005) (0.009) (0.012) (0.016) 

DIF Parcomp,, 0.025** 0.001 -0.040** -0.001 -0.034** -0.032* 0.000 
w (0.010) (0.015) (0.017) (0.012) (0.015) (0.017) (0.018) 

DIF Poleomp,, -0.010** 0.001 0.018*** 0.012** 0.019** 0.022** -0.005 
1t (0.005) (0.006) (0.006) (0.005) (0.008) (0.009) (0.010) 

A EE T T -0.003** 0.006 0.007 0.007 0.004 0.004 0.004 
(0.001) (0.004) (0.005) (0.004) (0.004) (0.004) (0.006) 

DIF GDP Givi. “ony 200s ORE: “eb ete on 
SUM GDP, 0.031** -0.002 -0.002 -0.005 -0.017 -0.002 -0.007 
(0.014) (0.014) (0.014) (0.014) (0.021) (0.019) (0.021) 

DIF GDP Per Capita., -0.008* -0.011**  -0.015*** — -0.020*** -0.011 -0.006 -0.001 
is (0.004) (0.005) (0.005) (0.007) (0.007) (0.008) (0.010) 

Country Fixed Effects Yes Yes Yes Yes Yes Yes Yes

N 3883 3883 3883 3883 3795 3883 3800 

adj. R? 0.227 0.388 0.427 0.433 0.471 0.422 0.416 


Note: All columns are restricted to the country pairs available in the 1970 sample. The drop in the number of
observations in 2005 and 2014 is due to some variables not being available for those years. Regressions
are based on Equation (3).
Figure A1. Coefficients of Ln(Genetic Distance)_ij and Ln(Geographic Distance)_ij
(1970-2014) (Exclude Former USSR Countries)

[Figure: yearly coefficients of Ln(Genetic Distance)_ij and Ln(Geographic Distance)_ij; vertical axis: Coefficient; horizontal axis: Year, 1970 to 2015.]

Notes: Coefficients from cross-sectional OLS regressions based on Equation (3), excluding all former USSR
countries. Red and blue lines are the coefficients of genetic and geographic distance, respectively. The grey
areas are the 95% confidence intervals of the coefficients (plus and minus 1.96 times the standard error of
the estimated regression coefficient).
Table A3. Cross-section OLS Coefficients for Specific Years (Exclude Former USSR
Countries)
(1) (2) (3) (4) (5) (6) (7) 
1970 1975 1985 1990 2005 2010 2014 
fe 0.009 -0.065*** = -0.045*** = -0.036** = -0.098*** -0.093*** —-0.084*** 
In(Genetic Distance) ;; 
(0.009) (0.018) (0.017) (0.017) (0.015) (0.016) (0.020) 
-0.072*** --0,110*** --0,087"°" -0.089%** <0.172*** -0.190*** -0.173%*** 
Ln(Geographic Distance);; l ` ` f j ` : 
(0.016) (0.021) (0.019) (0.019) (0.022) (0.022) (0.027) 
he : 0.04 -0.006 0.002 0.012 0.072** 0.033 -0.094 
Ln( Religious Distance);; 
(0.036) (0.034) (0.036) (0.034) (0.032) (0.037) (0.057) 
boca, Betis alee 0.000 0.035*** — 0.036*** -0.007 -0.005 0.005 0.019 
In(Linguistic Distance);; 
(0.011) (0.013) (0.013) (0.026) (0.023) (0.012) (0.019) 
. . : 0.156 -0.069**  -0.049***  -0.076*** -0.039* -0.02 -0.001 
Colonial Relationship,, 
(0.101) (0.030) (0.013) (0.015) (0.022) (0.027) (0.043) 
: 0.063 0.083* 0.022 0.068 0.151*** 0.152*** 0.132*** 
Contiguous; 
(0.045) (0.044) (0.037) (0.044) (0.048) (0.048) (0.050) 
-0.002 0.002 0.013** -0.007 -0.003 -0.002 -0.01 
DIF Democracy;;jt 
(0.003) (0.005) (0.006) (0.005) (0.006) (0.007) (0.008) 
40 * : E 3 
DIF Auto, 0.000 0.005 0.017 0.012 0.001 0.008 0.008 
(0.004) (0.006) (0.007) (0.006) (0.006) (0.008) (0.008) 
k 5 Ek% _ K 
DIF Polity, 0.004 0.006 0.018 0.005 0.000 0.002 0.004 
(0.003) (0.005) (0.006) (0.005) (0.005) (0.006) (0.007) 
DI -0.012** -0.002 -0.006 -0.007 0.014** 0.01 0.019* 
F  Parreg; 5 
(0.006) (0.007) (0.005) (0.005) (0.007) (0.008) (0.010) 
DI 0.025** -0.001 -0.043*** -0.009 -0.012 -0.009 0.003 
F Parcomp; x 
(0.010) (0.014) (0.015) (0.012) (0.011) (0.010) (0.011) 
DI -0.010** 0.001 0.019*** 0.011** 0.002 0.008 -0.003 
F Polcomp;;j 
(0.005) (0.006) (0.006) (0.005) (0.005) (0.006) (0.006) 
-0.003** 0.006 0.004 0.004 -0.003 -0.002 -0.003 
Ruggedness; x Ruggedness; 
(0.001) (0.004) (0.003) (0.003) (0.002) (0.002) (0.002) 
-0.006 0.000 -0.002 0.001 0.000 -0.001 0.001 
DIF GDP; 
(0.005) (0.005) (0.004) (0.004) (0.006) (0.004) (0.005) 
0.031** -0.006 -0.014 -0.017 -0.027* -0.011 -0.01 
SUM GDP, ,, 
(0.014) (0.014) (0.012) (0.012) (0.016) (0.014) (0.016) 
7 * E e kE L EKE =i 4k as * 7 
DIF GDP Per Capita; , 0.008 0.011 0.012 0.018 0.012 0.010 0.008 
(0.004) (0.004) (0.004) (0.005) (0.006) (0.006) (0.008) 
Country Fixed Effects Yes Yes Yes Yes Yes Yes Yes
N 7990 9290 12410 13314 14736 14980 14494 
adj. R? 0.263 0.399 0.374 0.387 0.468 0.42 0.418 
Notes: Two-way cluster-robust standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Regressions are based on Equation (3) and exclude all former USSR countries.
Table A4. Cross-section OLS Coefficients for Specific Years (with War-related
Variables)
(1) (2) (3) (4) (5) (6) (7)
1970 1975 1985 1990 2005 2010 2014 
rn 0.008 -0.069*** -0.046*** -0.043*** -0.069*** -0.068*** -0.059*** 
In(Genetic Distance), 
0.008) 0.018) 0.016) 0.016) 0.017) 0.018) 0.020) 
i oa -0.061*** -0.111*** -0.053*** -0.054*** -0.157*** -0.179*** -0.167*** 
Ln(Geographic Distance), 
0.015) 0.022) 0.018) 0.018) 0.023) 0.023) 0.025) 
ad ; 0.063* 0.008 0.042 0.039 0.065** 0.032 -0.098* 
Ln( Religious Distance);, 
0.034) 0.037) 0.031) 0.028) 0.031) 0.039) 0.055) 
SETE wee 0.008 0.035*** 0.054*** 0.005 0.006 0.014 0.015 
Ln(Linguistic Distance), 
0.010) 0.013) 0.019) 0.030) 0.025) 0.015) 0.020) 
; : ; 0.157 -0.068* -0.038 -0.065** 0.028 0.044 0.061 
Colonial Relationship,; 
0.098) 0.037) 0.029) 0.026) 0.045) 0.045) 0.056) 
; 0.064 0.092* 0.038 0.073 0.136*** 0.139*** 0.135*** 
Contiguous;; 
0.047) 0.046) 0.041) 0.045) 0.043) 0.043) 0.047) 
-0.004 0.000 0.004 -0.009** -0.004 -0.008 -0.016* 
DIF Democracy,; 
0.004) 0.005) 0.005) 0.004) 0.005) 0.006) 0.008) 
DIF Auto,, -0.002 0.003 0.009 0.004 -0.001 -0.007 -0.005 
0.004) 0.006) 0.006) 0.005) 0.005) 0.008) 0.009) 
* 3 š * 
DIF Polity; 0.007 0.004 0.009 0.000 0.002 0.001 0.006 
0.004) 0.006) 0.005) 0.004) 0.004) 0.006) 0.007) 
-0.015** -0.003 0.002 -0.001 0.013* 0.011 0.017* 
DIF Parreg;; 
0.006) 0.007) 0.006) 0.004) 0.007) 0.007) 0.009) 
0.037*** 0.004 -0.032** -0.005 -0.01 -0.001 0.009 
DIF Parcomp,; 
0.012) 0.017) 0.014) 0.009) 0.010) 0.010) 0.012) 
£ x% 401K xk : 
DIF Polcomp;; 0.015 0.001 0.017 0.011 0.002 0.005 0.006 
0.006) 0.009) 0.005) 0.004) 0.004) 0.005) 0.006) 
-0.002* 0.006 0.003 0.003 -0.004** -0.004** -0.005** 
Ruggedness, x Ruggedness; 
0.001) 0.004) 0.002) 0.002) 0.002) 0.002) 0.002) 
DIF GDP, -0.005 -0.001 -0.002 -0.002 -0.005 -0.004 0.000 
0.005) 0.006) 0.004) 0.005) 0.005) 0.004) 0.005) 
40K 3 z E 2 
SUM GDP, 0.033 0.001 0.000 0.003 0.02 0.006 0.008 
0.016) 0.014) 0.013) 0.013) 0.016) 0.013) 0.014) 
4 £ 40 à f 40K z * j * > 
DIF GDP Per Capita,, 0.006 0.008 0.005 0.009 0.011 0.011 0.006 
0.004) 0.004) 0.003) 0.004) 0.006) 0.006) 0.007) 
x% x xk z xk : xk 
WAR Duration; 0.004 0.002 0.000 0.000 0.000 0.000 0.000 
0.002) 0.000) 0.000) 0.000) 0.000) 0.000) 0.000) 
WAR Recentness;; 0.000 0.000 0.000 0.000 0.000 0.000 0.000 
0.000) 0.000) 0.000) 0.000) 0.000) 0.000) 0.000) 
40K 401K 40K * 
Millitary Alliance Relationship; 0.087 0.006 0.243 0.201 0.102 0.093 0.000 
0.035) 0.044) 0.074) 0.077) 0.059) 0.061) 0.000) 
WAR Freq (pre 1945); 0.644 0.413 0.433 0.734 -0.431 -0.47 -0.766 
0.431) 0.568) 0.551) 0.581) 0.496) 0.535) 0.478) 
40 x% xk xk 
UN Vote Correlation; 0.049 0.119 0.155 0.232 0.217 0.113 0.099 
0.033) 0.058) 0.075) 0.076) 0.075) 0.070) 0.074) 
Country Fixed Effects Yes Yes Yes Yes Yes Yes Yes 
N 7298 8720 12190 12858 20252 20544 19974 
adj. R? 0.275 0.412 0.44 0.439 0.466 0.417 0.432 


Notes: Two-way cluster-robust standard errors in brackets. * p < 0.1, ** p < 0.05, *** p < 0.01. 


Regressions are based on Equation (3). 
Table A5. Variable Definitions

DIF Democracy_ijt = abs(Democracy_it - Democracy_jt)
DIF Auto_ijt = abs(Autocracy_it - Autocracy_jt)
DIF Polity_ijt = abs(Polity_it - Polity_jt)
DIF Parreg_ijt = abs(Regulation of Participation_it - Regulation of Participation_jt)
DIF Parcomp_ijt = abs(Party Competition_it - Party Competition_jt)
DIF Polcomp_ijt = abs(Political Competition Concept_it - Political Competition Concept_jt)
DIF GDP Per Capita_ijt = ln(abs(GDP Per Capita_it - GDP Per Capita_jt))
DIF GDP_ijt = ln(abs(GDP_it - GDP_jt))
SUM GDP_ijt = ln(GDP_it + GDP_jt)
Common Legal Origin_ij = 1 if countries i and j share the same legal origin, 0 otherwise.
WAR Duration_ij = War_End_Date_ij - War_Start_Date_ij
WAR Recentness_ij = 1 if countries i and j have had a war within 20 years, 0 otherwise.
WAR Freq (pre 1945)_ij = Years of Bilateral War_(1870-1945) / (1945 - 1870)
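
As an illustration, the definitions in Table A5 can be written as small Python helpers. This is a minimal sketch, not the code used for the estimates: the function names are hypothetical, the numbers in the usage lines are made up, and raising an error when the two values coincide (so that the logarithm is undefined) is an assumption, since the table does not state how such pairs are treated.

```python
import math


def dif_abs(x_it: float, x_jt: float) -> float:
    """abs(x_it - x_jt), the form used for DIF Democracy, DIF Auto, DIF Polity, etc."""
    return abs(x_it - x_jt)


def dif_log(x_it: float, x_jt: float) -> float:
    """ln(abs(x_it - x_jt)), the form used for DIF GDP and DIF GDP Per Capita."""
    gap = abs(x_it - x_jt)
    if gap == 0:
        # Assumption: the treatment of identical values is not stated in the table.
        raise ValueError("log of a zero difference is undefined")
    return math.log(gap)


def sum_log(x_it: float, x_jt: float) -> float:
    """ln(x_it + x_jt), the form used for SUM GDP."""
    return math.log(x_it + x_jt)


def war_freq_pre_1945(years_of_bilateral_war: float) -> float:
    """Share of the 1870-1945 window in which the pair was at war."""
    return years_of_bilateral_war / (1945 - 1870)


# Hypothetical inputs, for illustration only.
print(dif_abs(8, 3))             # 5
print(war_freq_pre_1945(15))     # 0.2
```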


Table A6. List of Countries 
AFG AGO ALB ARE ARG ARM ATG AUS AUT AZE BDI BEL 
BEN BFA BGD BGR BHR BHS BLR BLZ BOL BRA BRB BRN 
BTN BUR BWA CAF CAN CHE CHL CHN CIV CMR COD COL 
COM CPV CRI CUB CYP CZE DEU DJI DMA DNK DOM DZA 
ECU EGY ERI ESP EST ETH FIN FJI FRA GAB GBR GEO 
GHA GIN GMB GNB GNQ GRC GRD GTM GUY HKG HND HRV 
HTI HUN IDN IND IRL IRN IRQ ISL ISR ITA JAM JOR 
JPN KAZ KEN KGZ KHM KIR KNA KOR KWT LAO LBN LBR 
LBY LCA LKA LSO LTU LUX LVA MAR MDA MDG MEX MKD 
MLI MLT MNG MOZ MRT MUS MWI MYS NAM NER NGA NIC 
NLD NOR NPL NZL OMN PAK PAN PER PHL PNG POL PRK 
PRT PRY QAT ROU RUS RWA SAU SDN SEN SGP SLB SLE 
SLV SMR SOM SUR SVK SVN SWE SWZ SYC SYR TCD THA 
TJK TKM TON TTO TUN TUR UGA UKR URY USA UZB VCT 
VEN VNM VUT WSM ZAF ZAR ZMB ZWE 

Notes: The table shows the ISO codes of the 176 countries included in our sample.

Table A7. List of Countries in European Sample (Europe 22) 

Austria         Belgium         Czech Republic  Denmark         Finland         France
Germany         Greece          Hungary         Iceland         Ireland         Italy
Macedonia       Netherlands     Norway          Poland          Portugal        Russia
Spain           Sweden          Switzerland     United Kingdom


Notes: The sample of 22 European countries is consistent with the sample in Giuliano, Spilimbergo
and Tonon (2014). There are 231 (22x21/2) distinct country pairs. Belgium, Iceland, Ireland and the
Netherlands have zero genetic distance from one another in our sample (4x3/2 = 6 distinct pairs).
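
The pair counts in the notes follow from the number of unordered pairs among n countries, n(n-1)/2. A minimal Python sketch of that count is given below; the helper name `distinct_pairs` is hypothetical and the country names are simply those listed in Table A7.

```python
from itertools import combinations


def distinct_pairs(countries):
    """Return every unordered pair of distinct countries."""
    return list(combinations(sorted(countries), 2))


europe_22 = [
    "Austria", "Belgium", "Czech Republic", "Denmark", "Finland", "France",
    "Germany", "Greece", "Hungary", "Iceland", "Ireland", "Italy",
    "Macedonia", "Netherlands", "Norway", "Poland", "Portugal", "Russia",
    "Spain", "Sweden", "Switzerland", "United Kingdom",
]
zero_distance = ["Belgium", "Iceland", "Ireland", "Netherlands"]

print(len(distinct_pairs(europe_22)))      # 231 = 22 * 21 / 2
print(len(distinct_pairs(zero_distance)))  # 6 = 4 * 3 / 2
```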


Table A8. Descriptive Statistics of the Different Samples 


Variable                                        Mean        Std. Dev.   Min.       Max.

Genetic Distance_ij (Europe 22)                 0.0050186   0.0042103   0.00000    0.011633
Ln(Genetic Distance)_ij (Europe 22)             -6.02764    1.564076    -10.7199   -4.45392
Genetic Distance_ij (Total 176 countries)       0.0369244   0.0185448   0.00000    0.094963
Ln(Genetic Distance)_ij (Total 176 countries)   -3.507772   0.8197276   -10.7199   -2.35427

Notes: The sample of Europe 22 countries is consistent with the sample in Giuliano, Spilimbergo
and Tonon (2014). Four countries in the Europe 22 sample have zero genetic distance in our sample.


Table A9. List of Former USSR Countries
ALB ARM AZE BIH BLR BYS CSK CZE DDR DEU EST GEO 
HRV HUN KAZ KGZ LTU LVA MDA MKD MNE POL RUS SCG 
SRB SVN TJK TKM UKR UZB YUG 

Notes: List of ISO codes of the 31 countries in the former USSR countries sample. There are 465
(31x30/2) distinct country pairs.